Predicting underlying pitch targets for intonation modeling
Abstract
This paper reports our preliminary attempt at modeling intonation using underlying pitch targets. The underlying pitch targets were derived using a nonlinear regression technique under the pitch target approximation model [17, 19]. We assume that underlying pitch targets can capture the most important intonation patterns while maintaining critical predictive power. Another important aspect of our approach is that we do not rely on pitch accent as a component of the system. To predict the parameters of the underlying targets, we used a recurrent neural network combined with a time-delay window. Comparing the predicted and original pitch targets, the root mean square error (RMSE) is 7.90 Hz and the correlation coefficient (r) is 0.78. The results are encouraging and suggest that the use of underlying pitch targets is a promising approach to intonation modeling.
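The two evaluation figures quoted in the abstract, RMSE and the correlation coefficient, can be illustrated with a minimal sketch. It assumes the predicted and original targets are available as equal-length lists of values in Hz and that r is the Pearson correlation. The function names and sample values are hypothetical and are not taken from the paper.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(predicted, reference):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    if n != len(reference) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_p == 0 or var_r == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_r)


if __name__ == "__main__":
    # Hypothetical pitch-target values in Hz, for illustration only.
    reference_hz = [120.0, 135.0, 150.0, 140.0, 125.0]
    predicted_hz = [118.0, 139.0, 146.0, 143.0, 130.0]
    print(f"RMSE = {rmse(predicted_hz, reference_hz):.2f} Hz")
    print(f"r    = {pearson_r(predicted_hz, reference_hz):.2f}")
```

RMSE is expressed in the same unit as the targets (Hz here), while r is unitless and insensitive to a constant offset or scale in the predictions, so reporting both figures gives complementary information.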
Related papers
Pitch Targets Anchor Chinese Tone and Intonation Patterns
This paper presents a study on the role of pitch targets in the manifestation of Chinese tone and intonation. Pitch targets are measured specifically as fundamental frequency (F0) peaks and valleys over time. Analysis and perceptual experiments were conducted on 72 sentences, each with almost identical tone mapping, uttered twice by a female native speaker as statements or questions. The tone and into...
Modeling Improved Prosody Generation from High-Level Linguistically Annotated Corpora
Synthetic speech usually suffers from poor F0 contours. Robust prediction of the underlying pitch targets relies on the quality of the predicted prosodic structures, i.e., the corresponding sequences of tones and breaks. In the present work, we have utilized a linguistically enriched annotated corpus to build data-driven models for predicting prosodic structures with increased accura...
Maximum-likelihood dynamic intonation model for concatenative text-to-speech system
In this work we present Maximum Likelihood (ML) joint pitch curve modeling, inspired by the HMM TTS synthesis concept. This model provides an optimal solution for the coarse target intonation curve (3 points per syllable) and incorporates both static and dynamic pitch values for better utterance intonation modeling. The coarse intonation curve may be optionally combined with the original pitch ex...
Intonation Components in short English Statements
In this study we attempt to identify the basic components of statement intonation as related to focus, accent and lexical stress in General American English. Instead of viewing f0 contours as direct acoustic correlates of intonation components, we regard them as the outcome of implementing different functional components of intonation under various articulatory constraints. Eight American Eng...
Timing of experimentally elicited minimal responses as quantitative evidence for the use of intonation in projecting TRPs
In an RT experiment, subjects were asked to respond with minimal responses to prerecorded dialogs and to a manipulated version of these dialogs that contained only intonation and pause information. Response delays and, especially, variances were higher for the impoverished, intonation-only stimuli than for the original recordings. It was also found that intonation-only utterances ending in a mid-fr...